Abstract

Background: Endoparasitoid wasps are important natural enemies of the widely distributed aphid pests and are 
mainly used as biological control agents. However, despite the increased interest in aphid interaction networks, 
only sparse information is available on the factors used by parasitoids to modulate aphid physiology. Our aim 
here was to identify the major protein components of the venom injected at oviposition by Aphidius ervi to ensure 
successful development in its aphid host, Acyrthosiphon pisum. 

Results: A combined large-scale transcriptomic and proteomic approach allowed us to identify 16 putative venom 
proteins among which three γ-glutamyl transpeptidases (γ-GTs) were by far the most abundant. Two of the γ-GTs 
most likely correspond to alleles of the same gene, with one of these alleles previously described as involved in 
host castration. The third γ-GT was only distantly related to the others and may not be functional owing to the 
presence of mutations in the active site. Among the other abundant proteins in the venom, several were unique to 
A. ervi such as the molecular chaperone endoplasmin possibly involved in protecting proteins during their secretion 
and transport in the host. Abundant transcripts encoding three secreted cysteine-rich toxin-like peptides whose function 
remains to be explored were also identified. 

Conclusions: Our data further support the role of γ-GTs as key players in A. ervi success on aphid hosts. However, they 
also show that this wasp venom is a complex fluid that contains diverse, more or less specific, protein components. 
Their characterization will undoubtedly help decipher parasitoid-aphid and parasitoid-aphid-symbiont interactions. 
Finally, this study also sheds light on the quick evolution of venom components through processes such as duplication 
and convergent recruitment of virulence factors between unrelated organisms. 

Keywords: Parasitoid wasp, Aphid, Acyrthosiphon pisum, Aphidius ervi, Venom proteins, Virulence, γ-glutamyl 
transpeptidase, Cysteine-rich peptides 



Background 

Aphids are Hemipteran pests responsible for major agri- 
cultural losses, notably due to vectored viral pathogens. 
They also have peculiar and poorly understood ecological 
and evolutionary features, which offer unparalleled oppor- 
tunities to address evolutionary issues. More particularly, 
their tight association with bacterial symbionts makes 
them an ideal model to study the evolution of the immune 
system and the modulation of immune interactions [1-3]. 
Aphids can be attacked by various natural antagonists in- 
cluding endoparasitoid braconid wasps from the subfamily 
Aphidiinae. These solitary parasitic wasps lay eggs inside 
the body of host juvenile stages or adults. The hatching 
larva then develops through three larval stages to become a 
pupa, protected inside the hardened host body called 
"mummy", from which an adult wasp will emerge [4]. 
Aphidius ervi is a widely used biological control agent 
that parasitizes several Macrosiphinae aphid species, in- 
cluding the pea aphid model Acyrthosiphon pisum [5,6]. 
To ensure development inside the host, A. ervi regulates 
host development and metabolism and possibly evades or 
overcomes the host immune response. Its success relies on the 
injection of venom at oviposition, as well as the release in 
the host of teratocytes, cells that derive from the dissoci- 
ation of a membrane surrounding the embryo [7-10]. 
Until now, the physiological effects observed in the host 
are mainly associated with parasitoid nutrition. Venom 
injection, for instance, induces the degeneration of host 
ovaries and the arrest of its reproduction, thus redirecting 
host nutritional resources to the developing parasitoid 
larva [11-13]. In contrast, egg encapsulation has seldom 
been reported for aphid parasitoids and whether they may 
suppress or evade host immune response, as described for 
most parasitoids of Diptera and Lepidoptera [14], remains 
to be determined. 

Despite the large amount of data on A. ervi behavior 
and physiology, only sparse information is yet available 
on its venom molecular composition. More surprisingly, 
there are no data on venom of other parasitoids of aphids 
and more generally of hemipteran hosts although they in- 
clude many pests of remarkable economic importance. 
Until now, the only factor identified from the venom of 
A. ervi is a γ-glutamyl transpeptidase (γ-GT) that was 
named Ae-γ-GT [13,15]. γ-GT enzymes play a pivotal 
role in glutathione metabolism by hydrolyzing and 
transferring the γ-glutamyl moiety from glutathione 
(GSH) to various acceptors [16]. Although γ-GTs are 
usually membrane-bound proteins, Ae-γ-GT was found 
as a soluble enzyme of 57 kDa (36 and 19 kDa subunits) 
in venom. It was also shown to be involved in castra- 
tion of its aphid host possibly because it may interfere 
with the delicate balance of glutathione, causing oxida- 
tive stress in ovarian cells and triggering fatal apoptosis 
of ovaries and early aphid embryos [13]. 

To identify the main A. ervi venom protein compo- 
nents, we performed a large-scale analysis using a com- 
bined transcriptomic and proteomic approach. Such 
broad approaches recently allowed thorough investiga- 
tions of venom components in several parasitoid spe- 
cies, thus improving our knowledge of their nature and 
diversity [17-26]. The present study is the first in-depth 
venom analysis of a parasitoid of Hemiptera, as well as of 
a braconid parasitoid devoid of polydnaviruses (PDVs), 
key factors of host regulation in several braconid and 
ichneumonid species [27]. Comparison of venom data 
sets for A. ervi and PDV-associated braconid wasps, 
such as Chelonus inanitus [25], Microctonus sp. [19] 
and Microplitis demolitor [17], will provide insights into 
how the use of various parasitism strategies impacts 
venom evolution. Although we identified a large number 
of transcripts and proteins, we have focused our analysis 
on the major venom components since they are the most 
likely to be involved in parasitism success [18,20]. 

Results and discussion 

Identification of the main secreted proteins in A. ervi 
venom through a combined transcriptomic and 
proteomic approach 

The transcriptomic analysis was performed on a French 
(FR) and an Italian (IT) A. ervi strain, using cDNA libraries 
from venom apparatus (glands and associated reservoirs). 
As our objective was to identify the major venom pro- 
teins, and since no reference genome was available, we 
decided to use the Sanger technology to produce long, 
high quality sequences (Additional file 1: Figure S1). The 
obtained number of sequences was approximately five 
times higher for the FR library than for the IT library 
(Additional file 2: Table S1). Tests of assembly per- 
formed on the pool of all IT and FR ESTs, using differ- 
ent parameters, revealed that a large part of the ESTs 
were shared between the IT and FR libraries. Moreover, 
GO terms comparison on the trimmed ESTs suggested 
a similar distribution for the two libraries (Additional 
file 3: Figure S2). The final assembly, therefore made 
using all pooled ESTs and default parameters, yielded a 
total of 1911 unisequences (unique sequences correspond- 
ing to either contigs or singletons), with a high level of 
redundancy (Additional file 2: Table S1). As expected from 
the relative number of sequences, a majority of IT ESTs 
(58%) were found in mixed contigs, whereas a majority of 
FR ESTs (61%) were found in the FR library only (Additional 
file 2: Table S1). Among the 42 abundant transcripts 
(represented by more than 10 ESTs), nearly 80% were 
mixed contigs, suggesting a rather similar venom com- 
position in the A. ervi strains (Additional file 2: Table S1). 
Functional annotation was performed using (i) sequence 
similarity searches against public databases as well as the 
main available predicted insect proteomes and (ii) auto- 
mated open reading frame (ORF) prediction, followed by 
search for signal peptide and InterPro domains on the 
translated sequences (Additional file 1: Figure S1). As 
already shown in previous venom analyses [18,22,25], 
more than 60% of unisequences had no significant similar- 
ity in databases and could not be assigned an InterPro 
annotation (Additional file 2: Table S1 and Additional file 4: 
Table S2). 
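
The two bookkeeping rules used in this paragraph (library origin of a unisequence, and the "more than 10 ESTs" abundance cutoff) can be sketched as follows. This is only an illustration under stated assumptions: the `Unisequence` record and the function names are hypothetical and are not taken from the assembly pipeline used in the study; the EST counts in the usage lines are those listed in Table 2.

```python
from dataclasses import dataclass

ABUNDANCE_CUTOFF = 10  # "abundant" = strictly more than 10 ESTs, as stated in the text


@dataclass(frozen=True)
class Unisequence:
    """Hypothetical record: one contig or singleton with per-library EST counts."""
    name: str
    fr_ests: int
    it_ests: int

    def __post_init__(self) -> None:
        if self.fr_ests < 0 or self.it_ests < 0 or self.fr_ests + self.it_ests == 0:
            raise ValueError("a unisequence needs at least one EST and no negative counts")

    @property
    def total_ests(self) -> int:
        return self.fr_ests + self.it_ests


def library_origin(u: Unisequence) -> str:
    """Return 'mixed' when ESTs come from both libraries, otherwise 'FR' or 'IT'."""
    if u.fr_ests > 0 and u.it_ests > 0:
        return "mixed"
    return "FR" if u.fr_ests > 0 else "IT"


def is_abundant(u: Unisequence) -> bool:
    """Apply the abundance cutoff quoted in the text."""
    return u.total_ests > ABUNDANCE_CUTOFF


# EST counts (FR, IT) as listed in Table 2.
for u in (
    Unisequence("CL1Contig7", 317, 2),
    Unisequence("CL15Contig1", 25, 0),
    Unisequence("CL257Contig1", 2, 0),
):
    print(u.name, library_origin(u), is_abundant(u))
```
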

The proteomic analysis was performed on the A. ervi 
FR strain, on venom gland and reservoir samples separ- 
ately. On a 6-16% SDS-PAGE, the protein content of 
each compartment was resolved in bands from less than 
15 kDa to more than 250 kDa (apparent molecular mass, 
Figure 1). As expected, most of the major bands ob- 
served in the venom glands were also detected in the 
reservoirs, despite an overall quantitative difference in 
Figure 1 Comparison of venom gland and reservoir protein 
profiles, and proteomic analysis. Proteins from A. ervi venom 
glands and reservoirs were separated on a 6-16% SDS-PAGE under 
reducing conditions and visualized by silver staining. All stained protein 
bands numbered on the gel were excised and submitted for protein 
identification by LC-MS-MS. Molecular mass is in kDa. 



protein load between these tissues due to the small 
amount of venom in the reservoirs. All the major bands 
on electrophoretic patterns as well as a number of minor 
bands (a total of 30 bands for the venom glands, 10 for the 
venom reservoirs) were excised, and tryptic peptides were 
analyzed by LC-MS-MS (Additional file 5: Table S3). The 
integrated analysis of transcriptomic and proteomic data 
resulted in 86 matches, among which a putative function 
was found for 65 unisequences (Table 1 and Additional 
file 6: Table S4). However, most of the unisequences 
found in proteomics were detected in the venom gland 
only (Figure 2), and many of them probably corre- 
sponded to cellular proteins (e.g. ribosomal proteins) 
(Table 1 and Additional file 6: Table S4). Some of the 
cellular proteins also found in the reservoir had a pre- 
dicted muscular function (e.g. actin, with as many as 30 
peptide matches, paramyosin and spectrin; Additional 
file 6: Table S4), thus supporting the main role of the 
reservoir in pumping and injecting venom during ovipos- 
ition. The presence of cellular proteins likely resulted from 
tissue contamination since cell leakage was difficult to 
prevent during venom collection from the gland (two 
filaments with a thin canal; Additional file 1: Figure S1), 
while venom could only be extracted from the reservoir 
by crushing the tissues. Although this likely resulted in 
under-evaluating their number, we therefore only con- 
sidered as putative venom proteins the 16 unisequences 
(i) found in proteomics in both venom glands and reser- 
voirs and (ii) predicted to be secreted or for which secre- 
tion could not be predicted due to the incompleteness of 
the sequence (Figure 2 and Table 2). 
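
The two selection criteria stated above can be expressed as a small filter. The sketch below is a minimal illustration, not the authors' procedure: the `ProteomicHit` record and the function name are hypothetical, and secretion is encoded as `True`, `False`, or `None` when prediction was impossible because the sequence was incomplete. The first three records use peptide counts from Tables 1 and 2; the last one is invented to show the rejection of a gland-only hit.

```python
from typing import NamedTuple, Optional


class ProteomicHit(NamedTuple):
    """Hypothetical summary of one unisequence matched by mass spectrometry."""
    name: str
    vg_peptides: int           # peptide matches in venom glands (VG)
    r_peptides: int            # peptide matches in reservoirs (R)
    secreted: Optional[bool]   # None = sequence incomplete, no prediction possible


def is_putative_venom_protein(hit: ProteomicHit) -> bool:
    """Criterion (i): detected in both VG and R.
    Criterion (ii): predicted secreted, or secretion could not be predicted."""
    found_in_both = hit.vg_peptides > 0 and hit.r_peptides > 0
    secretion_not_excluded = hit.secreted is None or hit.secreted
    return found_in_both and secretion_not_excluded


hits = [
    ProteomicHit("CL1Contig7", 133, 16, True),        # gamma-GT, Table 2
    ProteomicHit("CL3Contig5", 5, 1, None),           # LRR protein, Table 2
    ProteomicHit("Actin (pooled)", 3, 30, False),     # Table 1, no signal peptide
    ProteomicHit("gland_only_example", 4, 0, True),   # invented example
]
for hit in hits:
    print(hit.name, is_putative_venom_protein(hit))
```

Keeping incomplete sequences (the `None` case) is a deliberately permissive choice that mirrors criterion (ii); a stricter filter would require `hit.secreted is True`.
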

Putative function of the main identified A. ervi venom 
proteins 

A putative function was predicted for 12 of the 16 unise- 
quences considered as venom proteins (Table 2 and 
Additional file 6: Table S4). Among them, 7 sequences 
were considered as abundant based on the number of 
ESTs. Moreover, we generally observed a good correlation 
between the number of ESTs and the number of matches 
with mass spectrometry peptides, although the proteomic 
analysis was not strictly quantitative. The abundant unise- 
quences were (i) 3 γ-GTs, (ii) 1 serine protease homologue, 
(iii) 1 leucine-rich repeat domain-containing protein, (iv) 1 
serpin and (v) 1 endoplasmin (Table 2 and Additional 
file 6: Table S4). Real-time PCR analysis of the relative 
expression of a selection of these unisequences (1 γ-GT, 
the serine protease homologue, the serpin) showed 
venom apparatus-specific expression (Table 3 and Additional 
file 7: Figure S3), as expected for putative venom proteins. 

γ-glutamyl transpeptidases 

Our analysis led to identification of three different γ-GTs 
in A. ervi venom (Additional file 8: Figure S4), including 
Ae-γ-GT, which represent by far the most abundant pro- 
teins (Table 2 and Additional file 6: Table S4). γ-GTs are 
found in bacteria, plants, and animals. They are key 
enzymes in glutathione (GSH) homeostasis that catalyze 
the transfer of the γ-glutamyl moiety from GSH, as well as 
other γ-glutamyl compounds, to amino acids or GSH itself 
[16]. γ-GTs thus play an important role in intracellular 
redox status, cytosolic iron metabolism, and inflamma- 
tion. Although considered as heterodimeric cell-surface 
enzymes, γ-GTs are also found as soluble circulating 
forms in body fluids, as Ae-γ-GT in A. ervi venom [13]. 
Accordingly, all three A. ervi venom γ-GTs identified in 
our analysis were predicted to contain a signal peptide 
and thus could be secreted or shed from the cell surface 
(Table 2 and Additional file 6: Table S4). 
Table 1 Classification of unisequences found in proteomics according to putative function. Putative venom proteins 
are highlighted in italic 

                                                    Unisequences (a)       EST (b)            Mascot (c)  Signal
Putative function                                   Total  FR+IT FR  IT    Total  FR    IT    VG R        peptide (d)
14-3-3 family                                       1      1         -     7      7     -     1           No
Actin                                               2      -         2     2      -     2     3 30        No
Calpain                                             2      2         -     3      3     -     2           ?
Elongation factor 1-alpha                           1      1         -     4      4     -     1           ?
Elongation factor 2                                 1      1         -     6      6     -     3 1         ?
Endoplasmin                                         3      2 1       -     19     16    3     60 8        Yes#
Fatty acid synthase                                 1      1         -     1      1     -     1           ?
γ-glutamyl transpeptidase                           3      3         -     539    463   76    177 25      Yes
Glycoside hydrolase domain-containing protein       2      2         -     15     7     -     2           ?
Heat shock protein 70                               6      2 4       -     25     22    3     59          Yes#
Hypoxia up-regulated protein 1                      1      1         -     2      2     -     1           ?
Inositol-3-phosphate synthase                       2      2         -     2      2     -     2           ?
Leucine-rich repeat domain-containing protein       2      1 1       -     30     28    2     12 4        Yes#
Low-density lipoprotein receptor-related protein 2  1      1         -     13     12    1     1           ?
Neprilysin-like                                     1      1         -     2      2     -     1 2         ?
Paramyosin, long form-like                          1      -         1     1      -     1     1           ?
Peptidyl-prolyl cis-trans isomerase                 1      1         -     3      3     -     4           ?
Protein disulfide isomerase                         2      2         -     2      2     -     8           ?
Rab GTPase family                                   1      1         -     2      2     -     2           ?
Ribosomal protein                                   19     3 14      2     42     36    6     30          ?
Serine protease homologue                           5      3 2       -     97     78    19    22 1        Yes#
Serpin                                              2      2         -     26     26    -     10 3        Yes#
Spectrin                                            1      1         -     -      -     -     1           ?
Staphylococcal nuclease domain-containing protein   1      1         -     -      -     -     2           ?
Transcription factor BTF3-like                      1      1         -     -      -     -     1           ?
V-type proton ATPase catalytic subunit A            1      1         -     -      -     -     1           No
Vesicular integral-membrane protein VIP36-like      1      1         -     -      -     -     1           ?

(a) Uniseq: number of unisequences. 
(b) EST: number of ESTs. 
(c) Mascot: number of matches with peptides. 
(d) SP: Prediction of signal peptide by TargetP. A ? means that prediction of secretion could not be performed due to the incompleteness of the sequence(s). 
A # indicates that a SP was predicted for some but not all unisequences. 




Among the three A. ervi γ-GTs, the amino acid se- 
quences of CL1Contig2 and CL1Contig7 were respect- 
ively identical and very close (87% identity) to the 
previously published Ae-γ-GT sequence (Figure 3). Inter- 
estingly, CL1Contig2 was the most abundant venom 
γ-GT in the Italian strain, while CL1Contig7 was the 
most abundant in the French strain (Table 2 and 
Additional file 6: Table S4), suggesting they might be 
alleles occurring at different frequencies. This is in agree- 
ment with our observation of a rapid decrease in the fre- 
quency of CL1Contig2 in the French strain under our 
rearing conditions (data not shown). Whether these two 
γ-GTs similarly contribute to induce apoptosis in the pea 
aphid ovarian cells remains to be investigated. Strik- 
ingly, CL1Contig6, which is highly expressed in the 
French strain, shares only 51% identity with the published 
sequence (Figure 3). Moreover, although overexpressed 
in the venom apparatus (Table 3 and Additional file 7: 
Figure S3), it contains two mutations previously described 
to strongly reduce the enzymatic activity of human γ-GT1 
[28,29]. The corresponding residues are otherwise con- 
served in other hymenopteran GGT sequences belonging 
to the same clade (Figures 3, 4, and Additional file 9: 
Figure S5). This raises the question of whether it is a fully 
active γ-GT and which role it may play in A. ervi parasit- 
ism success. 
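
Percent identity values such as those quoted above are derived from a pairwise alignment. The sketch below shows one common way to compute them, counting only alignment columns in which neither sequence has a gap. This denominator is an assumption: the text does not state which definition was used, and other conventions (for example, dividing by the full alignment length) give different values. The function name and the two short aligned strings are invented for illustration and are not taken from Figure 3.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 6 ungapped columns, 5 identical residues.
print(round(percent_identity("GGSAK-T", "GGQAKMT"), 1))
```
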
Using Hymenopteran databases, we were able to iden- 
tify three distinct types of non-venomous γ-GTs forming 
three distinct phylogenetic clades (Figure 4). Interestingly, 
the three A. ervi venomous γ-GTs group with clade A, 
while the two γ-GTs recently identified in N. vitripennis 
venom, whose function is as yet unknown [22], group with 
clade B (Figure 4). The venomous γ-GTs of these parasit- 
oid species may thus originate from distinct duplication 
events of two different genes coding for "classical" γ-GT 
proteins. 

Interestingly, γ-GTs can be used by bacteria as viru- 
lence determinants. For instance, a γ-GT contributes to 
Helicobacter pylori tolerizing effects on murine dendritic 
cells and suppressive activity on T cells, mainly via the 
depletion of glutamine [30]. Campylobacter jejuni viru- 
lence and colonization of the avian gut is also dependent 
upon the activity of a γ-GT that participates in its cell 
apoptosis-inducing activity by a yet unknown mechan- 
ism [31]. Whether venom γ-GTs can also act as viru- 
lence factors in parasitoids remains to be assessed. 

Serine protease homologues 

Serine proteases are endopeptidases whose active site 
contains a serine and which are involved in various bio- 
logical processes, including immunity. Serine protease ho- 
mologues (SPHs) lack one or more residues essential for 
catalytic activity [32] and do not have proteolytic 



Table 2 Putative venom proteins classified according to the number of ESTs 

                                                                    EST (a)            Mascot (b)  Signal
Sequence name       Putative function                               Total  FR    IT    VG    R     peptide (c)
CL1Contig7          γ-glutamyl transpeptidase                       319    317   2     133   16    Yes
CL1Contig2          γ-glutamyl transpeptidase                       120    50    70    33    8     Yes
CL1Contig6          γ-glutamyl transpeptidase                       100    96    4     11    1     Yes
CL9Contig1          Serine protease homologue                       41     34    7     2     1     Yes
CL2Contig1          -                                               30     24    6     18    1     Yes
CL15Contig1         Serpin                                          25     25    -     9     2     Yes
CL3Contig3          Leucine-rich repeat domain-containing protein   23     21    2     7     3     Yes
CL28Contig1         Endoplasmin                                     15     13    2     52    6     Yes
CL2Contig11         -                                               14     14    -     18    1     ?
CL3Contig5          Leucine-rich repeat domain-containing protein   7      7     -     5     1     ?
CL56Contig1         Elongation factor 2                             6      6     -     3     1     ?
CL257Contig1        Neprilysin-like                                 2      2     -     1     2     ?
CL209Contig1        Endoplasmin                                     2      2     -     6     2     ?
CL296Contig1        -                                               2      2     -     3     1     ?
aar0aka7ya15cm1.1   -                                               1      1     -     7     1     ?
aar0aka8ya02cm1.1   Serpin                                          1      1     -     1     1     ?

(a) EST: number of ESTs. 
(b) Mascot: number of matches with peptides. 
(c) SP: Prediction of signal peptide by TargetP. A ? means that prediction of secretion could not be performed due to the incompleteness of the sequence. 





Figure 2 Venn diagram showing the distribution of 
unisequences found in proteomics between venom glands (VG) 
and reservoirs (R). The green and red rectangles highlight the 
number of unisequences for which sequence was complete and 
that were predicted to be secreted (S) or predicted not to be secreted 
(NS), respectively. The blue ellipse corresponds to considered "putative 
venom proteins". 
Table 3 Mean relative expression in venom apparatus and bodies without venom apparatus for a selection of 
unisequences coding for putative venom proteins and toxin-like peptides 

Unisequence name  Putative function           Number of ESTs  Venom apparatus (SE)  Bodies w/o venom apparatus (SE)
CL1Contig6        γ-glutamyl transpeptidase   100             811.62 (107.47)       2.51 (1.42)
CL9Contig1        Serine protease homologue   41              562.35 (6.93)         1.87 (0.9)
CL15Contig1       Serpin                      25              637.9 (86.78)         2.77 (1.75)
CL1Contig4        Toxin-like                  60              451.74 (39.01)        1.91 (1.07)
CL1Contig1        Toxin-like                  33              439.92 (17.43)        2.17 (1.22)
CL1Contig5        Toxin-like                  9               1057.45 (132.07)      1.87 (0.9)



Figure 3 Multiple alignment of γ-GT sequences. The three A. ervi γ-GT sequences identified were aligned with the published A. ervi γ-GT sequence 
[GenBank: CAL69624] and the human γ-GT1 sequence [Swiss-Prot: P19440]. Residues identical or similar are highlighted in black and grey, respectively. 
Stars indicate mutations in Aerv_CL1Contig6 that were described to affect the enzymatic activity of human γ-GT1. Aerv, A. ervi; Hsap, H. sapiens. 
Figure 4 Maximum-likelihood phylogenetic tree of hymenopteran γ-GT sequences. The blue, orange, and green vertical lines correspond 
to the three major clades (A, B and C) obtained for hymenopteran γ-GT sequences. A. ervi and N. vitripennis venomous γ-GT sequences are 
marked with blue and orange rectangles respectively. Numbers at corresponding nodes are bootstrap support values (1000 bootstrap replicates). 
The outgroup is the human γ-GT5 sequence [Swiss-Prot: Q6P531]. Aech, Acromyrmex echinatior; Aerv, Aphidius ervi; Aflo, Apis florea; Amel, Apis 
mellifera; Bimp, Bombus impatiens; Bter, Bombus terrestris; Cflo, Camponotus floridanus; Hsal, Harpegnathos saltator; Hsap, Homo sapiens; Mrot, 
Megachile rotundata; Nvit, Nasonia vitripennis. 



activity. The five unisequences identified in A. ervi venom 
apparatus libraries, with a total of 97 ESTs (Table 2 and 
Additional file 6: Table S4), encode different SPHs 
(11% to 90% sequence identity), all with mutation(s) 
on the catalytic triad (Additional file 10: Figure S6). The 
four SPHs for which the 5' coding sequence was complete 
contain a signal peptide, suggesting that they are se- 
creted. Based on our criteria, only one of these SPHs 
could be classified as a venom protein: it was consid- 
ered as abundant, with a total of 41 ESTs, and it was 
found in proteomics in the venom reservoir (Table 2 
and Additional file 6: Table S4). In addition, it was specifically 
overexpressed in the venom apparatus (Table 3 and 
Additional file 7: Figure S3). However, two other SPHs 
were also abundant although not found in the reservoir. 
Members of the serine protease family have been described 
in the venom of several other parasitoids [22,25,26,33,34], 
the most studied being Vn50, secreted in Cotesia rubecula 
venom and devoid of serine protease activity. Vn50 
acts as an inhibitor of the hemolymph melanization 
in the host Pieris rapae, presumably by competing with 
host serine protease homologs for binding to proPO, 
while remaining non-cleaved and stable in the haemo- 
lymph [33,35]. 
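
The distinction between an active serine protease and a serine protease homologue can be illustrated with a check of the catalytic triad. The sketch below assumes the canonical histidine, aspartate and serine triad of chymotrypsin-like serine proteases, and assumes that the three residues aligned to the triad positions have already been extracted from an alignment such as the one in Additional file 10. The function name and the example inputs are hypothetical and do not reproduce the actual A. ervi residues.

```python
from typing import List, Sequence, Tuple

# Canonical triad of chymotrypsin-like serine proteases (assumed reference).
CANONICAL_TRIAD = ("H", "D", "S")


def classify_by_triad(triad_residues: Sequence[str]) -> Tuple[str, List[Tuple[str, str]]]:
    """Classify a sequence from the residues aligned to the three triad positions.

    Returns a label and the list of (expected, observed) substitutions.
    """
    observed = tuple(res.upper() for res in triad_residues)
    if len(observed) != len(CANONICAL_TRIAD):
        raise ValueError("exactly three residues are expected (His, Asp, Ser positions)")
    substitutions = [
        (expected, found)
        for expected, found in zip(CANONICAL_TRIAD, observed)
        if expected != found
    ]
    if substitutions:
        return "serine protease homologue", substitutions
    return "putatively active serine protease", substitutions


# Invented inputs: an intact triad and a triad with the serine replaced.
print(classify_by_triad("HDS"))
print(classify_by_triad("HDG"))
```

An intact triad does not by itself prove proteolytic activity, which is why the first label is worded cautiously.
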
Leucine-rich repeat domain-containing proteins 

Two different unisequences encoding leucine-rich re- 
peat (LRR) domain-containing proteins, not previously 
described in a parasitoid venom, were found in our analysis 
with a total of 30 ESTs (Table 2 and Additional file 6: 
Table S4). Interestingly, the sequences were mostly 
found in the French library since only 2 ESTs came 
from the Italian strain (Table 2 and Additional file 6: 
Table S4). The unisequence that was complete contains a 
signal peptide at N-terminus suggesting the secretion 
of the protein. It also contains a total of 8 canonical LRR 
motifs separated by one to three amino acids (Additional 
file 11: Figure S7), although a manual analysis suggested 
the presence of six to seven additional, though cryptic, re- 
peats. Interestingly, the conserved LxxLxLxxNxLxxLxxxxF 
sequence present in the 8 canonical LRR motifs is 
similar to the one described in Toll-like receptors 
(TLRs) [36]. However, A. ervi predicted proteins only con- 
tain the LRR domain, in contrast to the majority of TLRs 
that are multidomain proteins. With the loss of all but 
the LRR domain, A. ervi venom proteins might act as 
scavengers for the pea aphid TLRs, thus impairing the 
host immune response via the Toll pathway. Interestingly, 
the use of a truncated single-domain protein as a virulence 
factor has already been described for a parasitoid venom 
protein [37]. 
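
A strict reading of the LxxLxLxxNxLxxLxxxxF consensus quoted above can be turned into a simple pattern search. The sketch below is illustrative only: it is not the method used for Additional file 11, the function name is hypothetical, and the toy sequence is invented. Because every consensus position must match exactly, a search of this kind would miss degenerate ("cryptic") repeats such as those found by manual analysis.

```python
import re
from typing import List, Tuple

# 'x' in the consensus means any residue, written '.' in the regular expression.
# The lookahead allows overlapping candidate motifs to be reported.
LRR_CONSENSUS = re.compile(r"(?=(L..L.L..N.L..L....F))")


def find_lrr_motifs(protein_seq: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched 19-residue segment) for each strict consensus hit."""
    return [(m.start(1), m.group(1)) for m in LRR_CONSENSUS.finditer(protein_seq.upper())]


# Invented toy sequence containing two consensus-matching segments.
toy = "MK" + "LAALALAANALAALAAAAF" + "GS" + "LQELDLSHNRLTTLPEGLF"
for start, segment in find_lrr_motifs(toy):
    print(start, segment)
```
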

Serpins 

Serpins (serine protease inhibitors) are a large family of 
functionally diverse protease inhibitors. They share a con- 
served structural architecture with an exposed reactive cen- 
ter loop (RCL) of about 20 amino acids, which acts as bait 
for target serine proteases [38]. Interestingly, the involve- 
ment of a Leptopilina boulardi venom serpin in suppress- 
ing host immunity was already demonstrated. LbSPNy 
indeed prevents melanization in the Drosophila host 
through inhibition of PO activation [39]. More recently, 
serpins were described in the venom of Hyposoter didy- 
mator [20] and M. demolitor [17] but their role in parasit- 
ism success remains unknown. The two identified A. ervi 
serpin-like unisequences were both found in the French li- 
brary only and detected in the venom reservoir (Table 2 
and Additional file 6: Table S4). However, only one of these 
unisequences, overexpressed in the venom apparatus 
(Table 3 and Additional file 7: Figure S3), could be consid- 
ered as abundant with 25 ESTs. Interestingly, both 
serpins lack the consensus hinge sequence (Additional 
file 12: Figure S8) essential for the conformational change 
involving the RCL and necessary to inhibit the target 
protease [38]. The identified venom protein thus prob- 
ably belongs to the group of non-inhibitory serpins that 
have varied roles such as chaperones or transport mole- 
cules [40]. 



Endoplasmin 

Endoplasmin, which belongs to the heat shock protein 
90 family, is a molecular chaperone located in the endo- 
plasmic reticulum (ER) and involved in the final process- 
ing and export of secreted proteins [41]. Although three 
incomplete endoplasmin-like unisequences were identi- 
fied in A. ervi, they match to different regions of the 
same N. vitripennis endoplasmin sequence (Additional 
file 13: Figure S9) and thus likely correspond to a single 
gene. The endoplasmin protein was considered as abun- 
dant based on the number of ESTs and accordingly de- 
tected at high levels in A. ervi venom reservoir (Table 2 
and Additional file 6: Table S4). The A. ervi sequence 
contains the C-terminal HEEL motif that normally pre- 
vents secretion of ER-resident proteins (Additional file 13: 
Figure S9). However, this ER retention is not absolute [42]. 
Endoplasmin has not yet been described in any parasit- 
oid venom but it has been associated with the secretion of 
pancreatic lipases and their further internalization by in- 
testinal cells [43]. This suggests a possible role of this 
chaperone in the secretion, stabilization, transport and 
host cell targeting of the different A. ervi venom proteins. 

Two other unisequences having putative functions in 
venom were found in low abundance in A. ervi based on the 
number of ESTs: (i) 1 elongation factor and (ii) 1 neprilysin- 
like protein (Table 2 and Additional file 6: Table S4). 

Elongation factor 

One transcript of elongation factor 2 (EF-2), an essential 
protein that regulates the process of polypeptide elong- 
ation during translation, was found in low abundance in 
the A. ervi French library (Table 2 and Additional file 6: 
Table S4). Although EF-2 was also found in the reservoir 
(1 peptide match), the sequence was not complete and 
accurate prediction of its secretion could not be per- 
formed. To our knowledge, there is no report yet of EF- 
2 involvement either as a virulence factor or a venom 
protein. Interestingly, elongation factor 1-alpha (EF-1α) 
was found in the venom of another parasitoid, L. hetero- 
toma [18], but its role in the host-parasitoid interaction 
is also unknown. EF-1α was identified as a secreted 
candidate virulence factor in Leishmania protozoan 
parasites, being possibly involved in the induction of 
macrophage deactivation through direct binding and 
activation of a specific host tyrosine phosphatase [44]. 

Neprilysin-like (NEP-like) 

One unisequence encoding a neprilysin-like protein was 
found in low abundance (2 ESTs and 2 peptide matches in 
the venom reservoir) in the A. ervi French library (Table 2 
and Additional file 6: Table S4). NEP-like proteins are zinc- 
dependent metalloproteases (ectopeptidases) belonging to 
the Ml 3 peptidase family. They are involved in the degrad- 
ation of a number of regulatory peptides in the nervous or 
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immune system of mammals [45] and insects [46]. Al- 
though typically membrane-bound, ectopeptidases such as 
NEP may also be shed from the membrane through a pro- 
teolytic process and found in the surrounding fluid [47]. 
NEP-like proteins were detected in the venom of the 
parasitoids L. boulardi [18], Microctonus hyperodae [19], H. 
didymator [20] and M. demolitor [17], and were also found 
associated with the VLPs produced in the ovary of V. 
canescens [48]. Although the role of soluble ectopeptidases 
is still not understood, NEP-like proteins have been hypothesized
to modulate the host immune system by degrading
immune-specific peptides [48].

Expression of genes encoding cysteine-rich toxin-like peptides
in A. ervi venom gland

Three unisequences coding for small cysteine-rich 
peptides predicted to be secreted were found in our 



transcriptomic analysis and demonstrated to be spe- 
cifically expressed in the venom apparatus (Table 3 and 
Additional file 7: Figure S3). Although two of these,
CL1Contig4 (60 ESTs) and CL1Contig1 (33 ESTs), were
considered abundant, the low molecular weight of
the predicted mature peptides (from 2.83 to 3.88 kDa;
Table 4) precluded their analysis by SDS-PAGE proteomics
(Additional file 6: Table S4).

BLAST hits were obtained with several small animal
toxins for the three unisequences, although E-values were
not highly significant owing to the short length of the peptide
sequences (Table 4). The toxin-like nature of these unisequences
was further supported by ClanTox predictions (Table 4).
Interestingly, multiple alignment revealed a highly con- 
served signal peptide sequence, suggesting a common 
evolutionary origin for the three peptides (Figure 5). By 
contrast, the sequences of the predicted mature peptides 



Table 4 Summary of the toxin-like peptides analysis and comparison with defensin-NV (Ye et al. [51])

                        CL1Contig4                   CL1Contig1         CL1Contig5         Defensin-NV
Number of ESTs
  Total                 60                           33                 6
  FR                    37                           27                 3
  IT                    23                           6
Signal peptide
  TargetP               Yes                          Yes                Yes
  Reliability           Strongest                    Strongest          Strongest
Sequence length
  Complete (aa)         60                           60                 51
  Mature (aa)           36                           36                 27
Molecular weight
  Mature (kDa)          3.88                         3.79               2.83
Similarity searches
  Swiss-Prot best hit   U8-theraphotoxin-Cj1a        Conotoxin Vi11.3   Conotoxin AbVIN    Defensin-1
  Accession             B1P1C0                       C7DQX8             Q9TVQ6             Q5J8R1.1
  Organism              Chilobrachys jingzhao        Conus vitulinus    Conus abbreviatus  Apis mellifera carnica
  Molecular function    Toxin                        Toxin              Toxin              Defensin
  Domain                Knottin                      Knottin            Knottin
  E-value               0.026                        2.9                0.34               2e-21
Toxin prediction
  ClanTox               Toxin-like                   Toxin-like         Toxin-like         Toxin-like
  Reliability           Strongest                    Strongest          Strongest          Strongest
Knottin prediction
  Knoter1D              Ambiguous knottin            Ambiguous knottin  Putative knottin   Not a knottin
AMP prediction
  AMPer                 Antimicrobial peptide Alo-3  Beta-defensin      Beta-defensin      Defensin
  Lowest HMM E-value    0.0016                       0.005              0.0054             2e-20
  ClassAMP              Antibacterial                Antibacterial      Antifungal         Antibacterial
  Probability           0.512                        0.556              0.366              0.806

Figure 5 Multiple alignment of toxin-like sequences. The three A. ervi toxin-like sequences were aligned with the mature peptide sequence
corresponding to each BLAST best hit (SwissProt database). Residues identical or similar are highlighted in black and grey, respectively. The predicted
signal peptide is underlined in red, the six conserved cysteine residues are identified by red stars. Theraphotoxin: U8-theraphotoxin-Cj1a from
Chilobrachys jingzhao [Swiss-Prot: B1P1C0]; Conotoxin_Vi: Conotoxin Vi11.3 from Conus vitulinus [Swiss-Prot: C7DQX8]; Conotoxin_Ab: Conotoxin
AbVIN from Conus abbreviatus [Swiss-Prot: Q9TVQ6].



were strongly divergent, except for the conservation of six
cysteine residues that may form three stabilizing disulfide
bridges. These cysteine residues are conserved in each of
the best BLAST hits for the three sequences (Figure 5), the
corresponding peptides being classified as knottins, extremely
stable small disulfide-rich proteins with a knotted
topology [49]. Remarkably, the three A. ervi toxin-like
peptides may also correspond to knottins, which have never
been described in any parasitoid venom to date, although the
prediction was not fully supported (Table 4).

Few peptides have been characterized to date from
parasitoid venoms. One of them is Vn1.5, a short peptide
of 14 amino acids required for the expression of Cotesia
rubecula polydnaviruses in Pieris rapae host hemocytes
and the subsequent inactivation of these hemocytes [50].
The Nasonia vitripennis venom analysis predicted the occurrence
of several cysteine-rich peptides with a protease
inhibitor motif [22]. However, functional data on these
peptides are lacking and only one defensin-like antimicrobial
peptide, defensin-NV, was purified from N.
vitripennis venom [51]. Defensin-NV is a 52 amino acid
peptide with six cysteines forming three disulfide bridges
that has strong antimicrobial activity against a wide
spectrum of microorganisms, but which is not predicted
to be a knottin (Table 4). Another N. vitripennis defensin-like
peptide, nasonin-3, with a similar structure but no
antimicrobial activity, was recently shown to be
a potential inhibitor of host hemocyte melanization
in vitro. It is however unclear whether it is found in
venom [52]. Interestingly, the three A. ervi toxin-like
peptides also share weak similarities with defensin-like
antimicrobial peptides (Table 4), but their possible role
as antimicrobial factors or inhibitors of melanization
remains to be assessed.

Conclusions 

This paper reports the first identification of the main 
putative venom proteins of a parasitoid of aphids, A. ervi, 
using the same combined large-scale transcriptomic and 
proteomic approach we successfully used previously [18]. 
The analysis focused on a restricted number of proteins 
based on their predicted abundance and the occurrence of 
proteomic matches both in venom gland and reservoir. A
total of 16 putative venom proteins were considered, a low
number compared to other analyses [18-26], suggesting
the possible occurrence of additional low-abundance venom
proteins in A. ervi. However, this conservative approach
largely precluded misidentification of cellular proteins as
venom factors. Interestingly, 12 out of the 16 considered
proteins could be assigned a predicted function, in contrast
to large-scale analyses in which the majority of putative
venom proteins did not display similarity to any
known protein [18-26]. The combined analysis of two
datasets corresponding to different A. ervi strains (French
and Italian) confirmed that the major venom proteins
are shared by different parasitoid populations. However,
it also identified striking differences in the abundance of
transcripts for some of the main unisequences such as
the γ-GTs, suggesting variations in allele frequency and/or
gene expression level among populations that remain
to be explored.

Our study confirmed the identification of Ae-γ-GT as
the most abundant protein by far in A. ervi venom, thus
supporting its role as a key player in parasitism success
[13]. In addition to an allelic form of Ae-γ-GT, we identified
a divergent, possibly non-functional, γ-GT, whose
biological function, if any, remains to be explored. Interestingly,
we recently identified a multigenic family for a
venom protein of a parasitoid of Drosophila, with all
members except one mutated in one or more essential
amino acids [18]. γ-GTs have also been observed in the
venom of the ectoparasitoid N. vitripennis, although no
information is available regarding their abundance and
function. Our data nevertheless add γ-GTs as a new example
of independent convergent recruitment of venom
proteins in evolutionarily distant parasitoid species.

Among the abundant putative venom proteins, serpins 
and SPHs were described in venom of other braconids 
and more distant parasitoid wasps, further suggesting 
occurrence of a conserved subset of venom proteins 
across parasitoid species [21,25,53]. 

Other putative venom proteins were unique to A. ervi,
including endoplasmin or LRR domain-containing proteins,
suggesting a rapid evolution of some venom components.
Finally, occurrence of toxin-like cysteine-rich
peptides was predicted in some parasitoid species but
the diversity of their nature and function remains to be
explored.

One main challenge will now be to decipher the biological
function of the identified venom proteins and their
role in the parasitism success of A. ervi. This might be
achieved using the RNAi technique, as RNAi-mediated
complete silencing of a venom protein was recently
demonstrated in an endoparasitoid wasp [54]. The results will open
the way to a better understanding of aphid-parasitoid immune
and nutritional interactions.

Methods 

Insect rearing 

The French (FR) A. ervi strain was produced by mass-rearing
the progeny of individuals emerged from aphid
mummies provided by Biobest (Ervi-M-system, Orange,
France). Parasitoids have since been maintained in cages
on the aphid A. pisum LL01 clone raised on fava bean,
under a 16:8 h light/dark cycle, at 18°C. The LL01 clone
hosts Buchnera aphidicola but is devoid of secondary
symbionts. The Italian (IT) strain of A. ervi was reared
on A. pisum maintained on potted fava bean plants,
under the same environmental conditions as described
above. Both host and parasitoid colonies were started
with insects collected in Southern Italy (Eboli, SA),
which were periodically bred over the years with field
material originating from the same area. No ethical approval
is needed for experimental work on insects such
as the wasp Aphidius ervi.

Collection of venom apparatus, total RNA isolation and 
cDNA libraries construction 

The transcriptomic analysis was performed on A. ervi
venom apparatus, corresponding to venom glands and
their associated reservoirs (Additional file 1: Figure S1).
Venom apparatus were dissected in Ringer's saline (KCl
182 mM; NaCl 46 mM; CaCl2 3 mM; Tris-HCl 10 mM)
and stored at -80°C. Total RNA was extracted for each
strain from 100 venom apparatus, using TRIzol Reagent
(Invitrogen) according to manufacturer's instructions.
cDNA libraries were constructed from 1 µg of total RNA
using the Creator SMART cDNA Library Construction
Kit (Clontech). Ligation products were transformed into
ElectroMax DH10B Escherichia coli competent cells
(Invitrogen).

Sequencing, EST processing and assembly 

A general overview of the sequence data processing is
given in Additional file 1: Figure S1. Sanger sequencing
was done on an ABI sequencer using the standard M13
forward primer and BigDye terminator cycle sequencing
kit (Applied Biosystems, Foster City, CA, USA). Following
a primary step of analysis of 384 clones to check the
quality of each library and confirm the presence of
venom protein-related sequences, a total of 6,000 clones
were sequenced, 5,000 for A. ervi FR and 1,000 for A.
ervi IT. FR ESTs were processed using SURF analysis
pipeline tools (SURF: SeqUence Repository and Feature
detection) as previously described [18]. IT ESTs were
trimmed using TIGR SeqClean software. High quality
trimmed ESTs longer than 100 bp from FR and IT libraries
were then pooled and assembled into contigs
using the TIGR-TGICL tool with different parameters.
Based on the test results, the final assembly was the one
performed with default parameters [55].
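
The length filter and pooling step described above can be illustrated with a minimal Python sketch. It is not the SURF, SeqClean or TGICL code: the function names, the library labels used as identifier prefixes and the in-memory FASTA input are assumptions made for illustration only. The only value taken from the text is the requirement that retained ESTs be longer than 100 bp.

```python
from typing import Dict, Iterator, Tuple

MIN_LENGTH = 100  # ESTs must be longer than 100 bp, as stated in the text


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def pool_long_ests(libraries: Dict[str, str],
                   min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Pool trimmed ESTs from several libraries, keeping those longer than min_length.

    Identifiers are prefixed with the library label (hypothetical convention)
    so that the origin of each EST can still be traced after assembly.
    """
    pooled: Dict[str, str] = {}
    for library, fasta_text in libraries.items():
        for name, seq in parse_fasta(fasta_text):
            if len(seq) > min_length:
                pooled[f"{library}_{name}"] = seq
    return pooled
```

Keeping the library label in the identifier is one simple way to recognize contigs built from both FR and IT reads after assembly.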

Sequence annotation and analysis 

To identify similarities with known proteins, the sequences
of contigs and singletons were compared using
the blastx algorithm against local non-redundant NR
(NCBI, 2012-10-25), UniProtKB/Swiss-Prot (SIB, 2012-10-21),
and insect predicted protein/proteome databases
(Acromyrmex echinatior v3.8, Acyrthosiphon pisum v2.1,
Aedes aegypti v1.3, Anopheles gambiae v3.6, Apis mellifera
v4.5, Bombyx mori, Drosophila melanogaster v5.46, Drosophila
pseudoobscura v2.28, Nasonia vitripennis v1.2 and
Tribolium castaneum v20051011) with a cut-off E-value
of 1e-7. ORF prediction and translation were performed
using FrameDP software [56] (available at
http://iant.toulouse.inra.fr/FrameDP/). Signal peptide prediction was
obtained using TargetP (available at
http://www.cbs.dtu.dk/services/). Search for protein domains was performed using
InterProScan. Gene functions and GO terms were automatically
assigned to the predicted proteins based on
the identification of InterPro domains with InterProScan.
Only the root domain of the hierarchical domain
organization available from EBI was conserved. Comparison
of GO terms between FR and IT contigs and
homogenization of the annotation level were performed
using GO slim.
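
As an illustration of how the E-value cut-off can be applied to similarity search results, the following Python sketch keeps, for each query, the best hit passing the 1e-7 threshold. It assumes the standard 12-column tabular BLAST output (E-value and bit score in the 11th and 12th columns) and treats the cut-off as inclusive; both are assumptions, as the output format and exact filtering rule are not specified in the text. The names `Hit` and `best_hits` are hypothetical.

```python
from typing import Dict, NamedTuple

EVALUE_CUTOFF = 1e-7  # cut-off E-value stated in the text


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def best_hits(tabular: str, cutoff: float = EVALUE_CUTOFF) -> Dict[str, Hit]:
    """Return, for each query, the lowest-E-value hit among hits passing the cutoff.

    Assumes tab-separated lines in the standard 12-column BLAST tabular layout.
    Ties on E-value are broken in favor of the higher bit score.
    """
    best: Dict[str, Hit] = {}
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        hit = Hit(fields[0], fields[1], float(fields[10]), float(fields[11]))
        if hit.evalue > cutoff:
            continue  # not significant under the chosen cut-off
        current = best.get(hit.query)
        if current is None or (hit.evalue, -hit.bitscore) < (current.evalue, -current.bitscore):
            best[hit.query] = hit
    return best
```

Under this cut-off, hits such as those reported for the toxin-like peptides in Table 4 (E-values of 0.026 to 2.9) would be discarded, consistent with those hits being described as not highly significant.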

Multiple amino acid sequence alignments were performed
using MUSCLE [57]. For phylogeny, search for
Hymenopteran γ-GT sequences was performed using
BLASTP at NCBI (http://www.ncbi.nlm.nih.gov/blast/).
Identification of N. vitripennis NV24088-PA sequence
was performed using HMMsearch from the HMMER
package [58] with the G_glu_transpept (PF01019) HMM
profile on N. vitripennis v1.2 proteome database. Phylogenetic
analysis of γ-GT amino acid sequences was performed
using maximum likelihood (ML) with PhyML
[59]. ProtTest [60] was used to select the best-fit model
of amino acid substitution for ML phylogeny. Leucine-rich
repeats (LRRs) were predicted using ScanProsite
(http://au.expasy.org/prosite/). Toxin prediction was performed
using ClanTox available at
http://www.clantox.cs.huji.ac.il/ [61]. Knottin prediction was performed
using Knoter1D available at the KNOTTIN database
(http://knottin.cbs.cnrs.fr).

SDS-polyacrylamide gel electrophoresis of venom and
protein identification 

The proteomic analysis was performed independently on
wasp venom glands and reservoirs (Additional file 1:
Figure S1). Forty A. ervi female venom apparatus were
dissected and reservoirs were separated from the glands.
Glands and reservoirs were then independently collected
in 25 µl of Ringer's solution supplemented with a protease
inhibitor cocktail (Sigma). Reservoirs were solubilized immediately
by mixing with 4x Laemmli buffer containing
β-mercaptoethanol (v/v), while glands were squeezed to
release the venom content. The gland suspension was
then centrifuged for 2 min at 500 g to remove the residual
tissues, and the supernatant was carefully collected and
mixed with 4x Laemmli buffer containing β-mercaptoethanol
(v/v). Both reservoir and gland samples were
boiled for 5 minutes. Proteins were then separated on a
6-16% linear gradient SDS-PAGE and the gel was silver
stained [62]. Identification of proteins by mass spectrometry
was performed on bands excised from the gel, cut
into small blocks, and rinsed with water and acetonitrile
prior to reduction and alkylation. Sample trypsinization
was then carried out overnight at 37°C with 12.5 ng/µl
trypsin (sequencing grade, Sigma). The generated peptides
were sequenced by nano-LC-tandem mass spectrometry
(MS/MS) (Q-TOF Ultima with a nano-electrospray
ionization source, Waters/Micromass, UK) in data-dependent
acquisition (DDA) mode using the five most
intense parent ions. The peptides were loaded on a C18
column (XBridge™ BEH130 3.5 µm, 75 µm × 150 mm,
Waters) and eluted with a 5 to 60% linear gradient at a
flow rate of 200 nl/min over 90 min (buffer A: water/
acetonitrile (98:2, v/v) and 0.1% formic acid; buffer B:
water/acetonitrile (20:80, v/v) and 0.1% formic acid). MS/MS
data analysis was performed with the Mascot software
(http://www.matrixscience.com) licensed in house, using
the contig sequences of the Aphidius mixed library and
non-redundant NR (NCBI). Peak lists generated for individual
bands from the same gel lane were merged together
into a single file before databank search submission. Data
validation criteria were (i) one peptide with individual ion
score above 50 (the Mascot significant identity threshold
corresponding to p < 0.005 is 38 in our case) or (ii) at least
two peptides of individual ion score above 20 (corresponding
to 1% probability that a peptide spectrum match
is a random event). The Mascot score was calculated as
-10Log(P). The calculated FDR (based on an automatic
decoy database search) was lower than 1%: FDR = 0.23%
and 0% for venom glands and reservoirs respectively.
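
The two validation criteria and the score definition given above can be written down as a short Python sketch. It only restates the rules from the text (one peptide with ion score above 50, or at least two peptides with ion score above 20, and score = -10Log(P)); the function names are hypothetical and the sketch does not reproduce Mascot's internal scoring or its decoy-based FDR estimate.

```python
import math
from typing import Iterable


def mascot_score(p: float) -> float:
    """Ion score defined as -10*log10(P), P being the probability of a random match."""
    return -10.0 * math.log10(p)


def score_to_probability(score: float) -> float:
    """Inverse of mascot_score: probability that the match is a random event."""
    return 10.0 ** (-score / 10.0)


def is_validated(ion_scores: Iterable[float],
                 single_min: float = 50.0,
                 pair_min: float = 20.0) -> bool:
    """Apply the two validation criteria stated in the text to one protein.

    (i) at least one peptide with an individual ion score above single_min, or
    (ii) at least two peptides with individual ion scores above pair_min.
    """
    scores = list(ion_scores)
    if any(s > single_min for s in scores):
        return True
    return sum(1 for s in scores if s > pair_min) >= 2
```

With this definition, an ion score of 20 corresponds to P = 0.01, which matches the 1% probability quoted for criterion (ii).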

Quantitative real-time RT-PCR 

Total RNA was isolated either from dissected venom 
apparatus or from the rest of the female bodies (without 



venom-producing tissues) using the TRIzol reagent 
(Invitrogen), and reverse-transcribed using the iScript 
cDNA Synthesis Kit (BioRad). qPCR reactions were 
then carried out on an Opticon monitor 2 (BioRad) 
using the Absolute qPCR SYBR MasterMix Plus for 
SYBR Green I No ROX (Eurogentec). Primer pairs are 
listed in Additional file 14: Table S5. PCR conditions
were as follows: 50°C for 2 min, 95°C for 10 min, and 
40 cycles of 95°C for 30 s, 60°C for 30 s and 68°C for 
30 s. Each reaction was performed in triplicate and the 
mean of three independent biological replicates was 
calculated. All data were normalized using RPL19 and 
RPL23 as controls and results were analyzed using 
Qbase Software (Ghent University, Ghent, Belgium). 
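
The principle of normalizing a target gene to two reference genes can be sketched as follows. This is not the Qbase implementation: it assumes an amplification efficiency of 2 for every primer pair, averages technical replicates on the Cq scale, and uses the geometric mean of the reference gene quantities as normalization factor. The gene label "target" and all Cq values in the usage note are invented for illustration.

```python
import math
from statistics import mean
from typing import Dict, Sequence

REFERENCE_GENES = ("RPL19", "RPL23")  # reference genes named in the text


def normalized_expression(cq: Dict[str, Sequence[float]],
                          target: str,
                          references: Sequence[str] = REFERENCE_GENES,
                          efficiency: float = 2.0) -> float:
    """Relative quantity of `target` normalized to the reference genes.

    cq maps each gene to its technical-replicate Cq values for one sample.
    The normalization factor is the geometric mean of the reference quantities.
    An efficiency of 2 (perfect doubling per cycle) is an assumption.
    """
    def quantity(gene: str) -> float:
        return efficiency ** (-mean(cq[gene]))

    log_refs = [math.log(quantity(g)) for g in references]
    norm_factor = math.exp(mean(log_refs))  # geometric mean
    return quantity(target) / norm_factor


def fold_change(sample_a: Dict[str, Sequence[float]],
                sample_b: Dict[str, Sequence[float]],
                target: str) -> float:
    """Ratio of normalized expression in sample_a over sample_b."""
    return normalized_expression(sample_a, target) / normalized_expression(sample_b, target)
```

For example, with hypothetical Cq values of 20 (venom apparatus) and 25 (rest of the body) for the target and identical reference Cq values in both samples, `fold_change` returns a 32-fold enrichment in the venom apparatus.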

Availability of supporting data 

All trimmed ESTs for A. ervi FR and A. ervi IT are available
in the NCBI dbEST repository with the following
accession numbers: JZ569599 - JZ573851. The assembled
transcripts corresponding to putative venom proteins have
been deposited in GenBank under Transcriptome Shotgun
Assembly accession number GBCU00000000. The version
described in this paper is the first version, GBCU01000000.

Additional files 



Additional file 1: Figure S1. Schematic representation of the
combined large-scale transcriptomic and proteomic approach. Upper
picture: venom apparatus of A. ervi. VG: venom gland; R: reservoir;
DG: Dufour gland.

Additional file 2: Table S1. General features of the A. ervi cDNA FR and
IT libraries, results of assembly of pooled FR and IT sequences and
similarity searches.

Additional file 3: Figure S2. Interlibrary comparison of the representation 
of GO categories. Distribution of the number of unisequences associated 
with GO terms for the FR and IT libraries. The difference in the number of 
FR and IT sequences was taken into account using the ratio of the number 
of trimmed sequences between FR and IT. 

Additional file 4: Table S2. Most abundantly represented transcripts
(>10 ESTs) in the A. ervi cDNA libraries. Mixed contigs are highlighted in gray.

Additional file 5: Table S3. Matrix scores and peptides identified by
mass spectrometry (R).

Additional file 6: Table S4. Unisequences found in proteomics. Mixed 
contigs are highlighted in gray. 

Additional file 7: Figure S3. Mean relative expression in venom 
apparatus and bodies without venom apparatus. qRT-PCR experiments 
were performed for a selection of unisequences coding for putative 
venom proteins and toxin-like peptides. All data were normalized using 
RPL19 and RPL23 controls. 

Additional file 8: Figure S4. Specific peptides identified in proteomics
for the three A. ervi venom γ-GTs. The specific peptides identified for
CL1Contig7, CL1Contig2 and CL1Contig6 are indicated in red.

Additional file 9: Figure S5. Partial multiple alignment of γ-GT
sequences. The three A. ervi γ-GT sequences identified were aligned with
related hymenopteran γ-GT sequences from the same clade (clade A in
Additional file 4: Figure 4) and the human γ-GT1 sequence [Swiss-Prot:
P19440]. The part of the multiple alignment displayed in the figure contains
the mutations in the Aerv_CL1Contig6 that were described to affect the
enzymatic activity of human γ-GT1. Mutations are indicated with stars and
red letters. Residues identical or similar are highlighted in black and grey,
respectively. Aech, Acromyrmex echinatior; Aerv, Aphidius ervi; Aflo, Apis florea;
Amel, Apis mellifera; Bimp, Bombus impatiens; Bter, Bombus terrestris; Cflo,
Camponotus floridanus; Hsal, Harpegnathos saltator; Hsap, Homo sapiens;
Mrot, Megachile rotundata; Nvit, Nasonia vitripennis.

Additional file 10: Figure S6. Multiple alignment of the five A. ervi 
serine protease homologue sequences. Residues identical or similar are 
highlighted in black and grey, respectively. Letters in red indicate 
residues of the catalytic triad (His, Asp and Ser) for which mutations are 
found in A. ervi serine protease homologue sequences. 

Additional file 11: Figure S7. Multiple alignment of LRR
domain-containing sequences. Residues identical or similar are
highlighted in black and grey, respectively. The predicted signal peptide is
underlined in red. The 8 canonical LRR motifs are underlined in blue.

Additional file 12: Figure S8. Multiple alignment of serpin sequences.
The two A. ervi serpin sequences identified were aligned with H.
didymator Hd-Ven390 [20] and L. boulardi LbSPNy [EMBL: ACQ83466.1]
venom serpin sequences. Residues identical or similar are highlighted in
black and grey, respectively. The hinge region is underlined in red.

Additional file 13: Figure S9. Multiple alignment of endoplasmin
sequences. The three A. ervi endoplasmin-like unisequences were
aligned with N. vitripennis endoplasmin [GenBank: XP_001599282.1].
Residues identical or similar are highlighted in black and grey, respectively.

Additional file 14: Table S5. Primer pairs used for qRT-PCR experiments. 

EF-1α: Elongation factor 1-alpha; EF-2: Elongation factor 2; ER: Endoplasmic
reticulum; EST: Expressed sequence tag; FR: French strain of A. ervi; γ-GT: γ